Preparing the data

## Rows: 369
## Columns: 5
## Key: State, Industry [1]
## $ State       <chr> "Northern Territory", "Northern Territory", "Northern Terr…
## $ Industry    <chr> "Food retailing", "Food retailing", "Food retailing", "Foo…
## $ `Series ID` <chr> "A3349527K", "A3349527K", "A3349527K", "A3349527K", "A3349…
## $ Month       <mth> 1988 Apr, 1988 May, 1988 Jun, 1988 Jul, 1988 Aug, 1988 Sep…
## $ Turnover    <dbl> 25.6, 26.7, 27.7, 28.2, 28.2, 29.4, 28.3, 27.4, 30.9, 26.7…

Analyzing the data

Overall, there is an upward trend between 1988 and 2019, with slight dips in turnover around 1997 and 2000. There is also a sudden drop at the start of each year, indicating a seasonal pattern that we can use for forecasting. The plot does not show any obvious cyclic behavior, and the variation in the seasonal pattern increases as the level of the series rises.

From the season plot, we can confirm that there is indeed a slight decrease during January-February every year. We can also see that, over roughly 1988 - 2003, turnover tends to increase until peaking around July-August. This pattern becomes more apparent over time: during 2006-2019, turnover usually peaks in July and declines afterwards.

The subseries plot confirms most of the observations from the previous plot: a small decrease at the start of the year, followed by an increase that typically peaks around July and then declines slowly.

From the ACF, the slow decay of significant positive values indicates that the series is trended, and the zoomed ACF plot shows local increases at lags 12, 24, 36, …, which indicate yearly seasonality.

From the PACF, there are significant spikes at lags 7, 13, 19 and 25, but the spike at lag 19 is less significant than the one at lag 7. We therefore lean towards the plot suggesting yearly rather than half-yearly seasonality.
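The plots discussed above could be reproduced with fpp3 along these lines. This is a sketch: `myseries` is an illustrative name, and the extraction from `aus_retail` is an assumption based on the column structure shown in the glimpse earlier.

```r
library(fpp3)  # loads tsibble, tsibbledata, feasts, fable, ggplot2

# Assumed extraction: the glimpse above matches tsibbledata::aus_retail
myseries <- aus_retail |>
  filter(State == "Northern Territory", Industry == "Food retailing")

autoplot(myseries, Turnover)       # trend, seasonality, growing variation
gg_season(myseries, Turnover)      # Jan-Feb dip, mid-year peak
gg_subseries(myseries, Turnover)   # one mini-plot per month
myseries |> ACF(Turnover, lag_max = 48) |> autoplot()   # slow decay, bumps at 12, 24, 36
myseries |> PACF(Turnover, lag_max = 48) |> autoplot()  # spikes near seasonal lags
```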

Transforming and Differencing

For the transformation, we want a Box-Cox transformation that keeps the variation roughly constant across the series, while staying close to a log transformation (lambda = 0) to keep things simple. To choose such a lambda, we use Guerrero's method and also test lambda values of 0 (log), 0.1 and -0.1.

From the transformed plots, lambda values from 0.1 down to 0 (log transformation) give a decent transformation at both the lower and higher levels of the series.

However, lambda = -0.1 seems to stabilize the variation even better at both levels of the series, so -0.1 is a reasonable choice. Guerrero's method gives lambda ≈ -0.062, which is fairly close to our observation, so any value in the range -0.1 to -0.05 should transform the series well enough to help with modeling.

Since the Guerrero value is computed by an optimization procedure and is close to our visually chosen lambda, we will use the Guerrero lambda for the remaining sections.
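As a sketch, the Box-Cox transform itself is simple enough to write in base R. The function below mirrors fpp3's `box_cox()`; the `features(..., guerrero)` call shown in the comment is how the lambda ≈ -0.062 above would be obtained, and the sample values are taken from the glimpse earlier.

```r
# Box-Cox transform; lambda = 0 reduces to the log transform.
# In fpp3 the Guerrero lambda would be obtained with something like:
#   myseries |> features(Turnover, features = guerrero)
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (sign(y) * abs(y)^lambda - 1) / lambda
}

lambda <- -0.062                 # Guerrero value reported above
y <- c(25.6, 26.7, 27.7, 28.2)   # first few Turnover values from the glimpse
round(box_cox(y, lambda), 3)

# For small |lambda| the result stays close to log(y), which is why
# lambda values near 0 all look similar in the transformed plots
max(abs(box_cox(y, lambda) - log(y)))
```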

## # A tibble: 1 × 4
##   State              Industry       kpss_stat kpss_pvalue
##   <chr>              <chr>              <dbl>       <dbl>
## 1 Northern Territory Food retailing      6.15        0.01
## # A tibble: 1 × 3
##   State              Industry       nsdiffs
##   <chr>              <chr>            <int>
## 1 Northern Territory Food retailing       1

The KPSS unit-root test on our transformed data (p-value = 0.01) indicates that it is not stationary, so differencing is required. The seasonal unit-root test suggests one seasonal difference.

## # A tibble: 1 × 4
##   State              Industry       kpss_stat kpss_pvalue
##   <chr>              <chr>              <dbl>       <dbl>
## 1 Northern Territory Food retailing     0.252         0.1
## # A tibble: 1 × 3
##   State              Industry       ndiffs
##   <chr>              <chr>           <int>
## 1 Northern Territory Food retailing      0

The unit-root test after one seasonal difference indicates that, with this transformation and differencing, our series is now stationary (p-value = 0.1 > 0.05).

A further unit-root test for the number of first differences suggests that no first differencing is needed for the transformed, seasonally differenced series.

The rapid decay shown in the ACF plot also suggests that the series is now reasonably stationary.
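The unit-root tests and differencing above follow the standard feasts workflow. A sketch, assuming `myseries` holds the series and `lambda` the Guerrero value:

```r
library(fpp3)

# KPSS on the transformed series: p = 0.01 -> not stationary
myseries |> features(box_cox(Turnover, lambda), unitroot_kpss)
# Number of seasonal differences suggested: 1
myseries |> features(box_cox(Turnover, lambda), unitroot_nsdiffs)

# After one seasonal difference (lag 12 for monthly data):
# KPSS p = 0.1 -> stationary, and ndiffs = 0 -> no first difference needed
myseries |>
  features(difference(box_cox(Turnover, lambda), lag = 12), unitroot_kpss)
myseries |>
  features(difference(box_cox(Turnover, lambda), lag = 12), unitroot_ndiffs)
```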

ETS modeling

Our initial analysis of the original time series identified both trend and seasonality, and in particular that the seasonal variation grows with the level of the series.

Therefore, it seems logical to use a multiplicative seasonal component with an additive or damped additive trend for our ETS models.

Hence, our candidate specifications combine multiplicative errors and seasonality with an additive or additive damped trend. We fit all of these models plus one automatically selected model using ETS() and check their accuracy against a 24-month test set.
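A sketch of the model comparison, with illustrative names (`train`, `fit_ets`); the exact specifications of the hand-picked models are assumptions based on the discussion above:

```r
library(fpp3)

# Hold out the last 24 months as the test set
train <- myseries |> filter(Month <= max(Month) - 24)

fit_ets <- train |>
  model(
    ETS_a    = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    ETS_ad   = ETS(Turnover ~ error("M") + trend("Ad") + season("M")),
    ETS_auto = ETS(Turnover)     # automatic specification search
  )

fit_ets |>
  forecast(h = "2 years") |>
  accuracy(myseries)             # test-set ME, RMSE, MAE, ...
```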

## # A tibble: 4 × 12
## # Groups:   .model [4]
##   .model  State Industry .type     ME  RMSE   MAE    MPE  MAPE  MASE RMSSE  ACF1
##   <chr>   <chr> <chr>    <chr>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ETS_a   Nort… Food re… Test  -2.61   3.45  2.86 -2.14   2.32   NaN   NaN 0.637
## 2 ETS_a_… Nort… Food re… Test  -4.00   4.78  4.01 -3.24   3.25   NaN   NaN 0.675
## 3 ETS_ad  Nort… Food re… Test   0.503  3.44  2.82  0.248  2.20   NaN   NaN 0.698
## 4 ETS_au… Nort… Food re… Test   0.503  3.44  2.82  0.248  2.20   NaN   NaN 0.698

The accuracy results show that, in terms of RMSE (and the other statistics as well), the automatic ETS model and the Additive Damped Trend ETS model are essentially identical, with the Additive Trend model close behind. We therefore select the automatic model and check whether its trend parameter is damped.

## # A tibble: 1 × 3
##     AIC  AICc   BIC
##   <dbl> <dbl> <dbl>
## 1 2456. 2458. 2522.
## Series: Turnover 
## Model: ETS(M,Ad,M) 
##   Smoothing parameters:
##     alpha = 0.5465767 
##     beta  = 0.01606069 
##     gamma = 0.0001096571 
##     phi   = 0.9799963 
## 
##   Initial states:
##      l[0]      b[0]      s[0]     s[-1]     s[-2]  s[-3]    s[-4]    s[-5]
##  26.45709 0.2628342 0.9781219 0.8808432 0.9216392 1.0352 0.973809 1.030827
##     s[-6]    s[-7]    s[-8]    s[-9]   s[-10]    s[-11]
##  1.019912 1.081409 1.090852 1.015931 1.007646 0.9638097
## 
##   sigma^2:  9e-04
## 
##      AIC     AICc      BIC 
## 2436.387 2438.485 2505.571

The two sets of information criteria above correspond to the Additive and the Additive Damped Trend models, in sequence. The AIC and AICc of the Additive Damped Trend ETS model are lower, and its accuracy statistics are also better in the accuracy table, so it is relatively safe to conclude that the damped-trend ETS model is the most suitable.

Our automatic model is ETS(M,Ad,M), with smoothing parameters alpha ≈ 0.546, beta ≈ 0.016, gamma ≈ 0.0001 and damping parameter phi ≈ 0.980.

## # A tsibble: 24 x 3 [1M]
##       Month .mean                  `80%`
##       <mth> <dbl>                 <hilo>
##  1 2017 Jan  114. [109.9176, 118.5323]80
##  2 2017 Feb  109. [104.5816, 114.0390]80
##  3 2017 Mar  122. [115.6505, 127.4219]80
##  4 2017 Apr  120. [113.5129, 126.3011]80
##  5 2017 May  126. [118.2304, 132.7955]80
##  6 2017 Jun  127. [118.7686, 134.6219]80
##  7 2017 Jul  136. [127.0742, 145.3201]80
##  8 2017 Aug  135. [125.5333, 144.8093]80
##  9 2017 Sep  128. [117.9860, 137.2673]80
## 10 2017 Oct  129. [118.8410, 139.4257]80
## # ℹ 14 more rows
## # A tibble: 1 × 5
##   State              Industry       .model        lb_stat     lb_pvalue
##   <chr>              <chr>          <chr>           <dbl>         <dbl>
## 1 Northern Territory Food retailing ETS(Turnover)    85.6 0.00000000762

However, the residual diagnostics reveal a problem: the Ljung-Box test strongly rejects the white-noise hypothesis (p ≈ 7.6e-9), so the residuals remain autocorrelated.

Hence, when forecasting with this ETS model, we should treat the forecast values and their intervals with some caution. Overall, though, the forecast plot tracks the test set fairly well, with reasonable prediction intervals.
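The diagnostics described above could be produced as follows (a sketch; `fit_ets` and the model name `ETS_auto` are illustrative):

```r
# Residual plots (time plot, ACF, histogram) for the chosen ETS model
fit_ets |> select(ETS_auto) |> gg_tsresiduals()

# Ljung-Box test on the innovation residuals; lag = 24 covers two
# seasonal periods of monthly data. The tiny p-value reported above
# indicates leftover autocorrelation.
fit_ets |>
  select(ETS_auto) |>
  augment() |>
  features(.innov, ljung_box, lag = 24)
```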

ARIMA modeling

For ARIMA modeling, we examine the transformed and differenced time series to determine a few candidate models. Our earlier unit-root tests indicated that no further differencing is needed, and although it is not entirely obvious whether a further difference would help, we will keep using the seasonally differenced data to choose ARIMA models.

The PACF shows a significant spike at lag 12, with nothing notable at the seasonal lags of the ACF, which might suggest a seasonal AR(1) term. At the non-seasonal lags, two significant spikes in the PACF suggest a possible AR(2) term.

Therefore, our 3 possible ARIMA models can be:

  • ARIMA(2,0,0)(1,1,0)[12] as starting point
  • ARIMA(2,0,1)(1,1,1)[12]
  • ARIMA(2,0,1)(1,1,0)[12]

We will also run the default stepwise search and a wider search with the ARIMA() function to find further candidate models.
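The candidate models and the two automatic searches can be specified in fable along these lines (a sketch; `train` and `lambda` as before):

```r
library(fpp3)

fit_arima <- train |>
  model(
    arima1       = ARIMA(box_cox(Turnover, lambda) ~ pdq(2, 0, 0) + PDQ(1, 1, 0)),
    arima2       = ARIMA(box_cox(Turnover, lambda) ~ pdq(2, 0, 1) + PDQ(1, 1, 1)),
    arima3       = ARIMA(box_cox(Turnover, lambda) ~ pdq(2, 0, 1) + PDQ(1, 1, 0)),
    arima_step   = ARIMA(box_cox(Turnover, lambda)),                   # stepwise search
    arima_search = ARIMA(box_cox(Turnover, lambda), stepwise = FALSE)  # wider search
  )

glance(fit_arima) |> arrange(AICc)                         # information criteria
fit_arima |> forecast(h = "2 years") |> accuracy(myseries) # test-set accuracy
```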

## # A mable: 5 x 4
## # Key:     State, Industry, Model name [5]
##   State              Industry    `Model name`                             Orders
##   <chr>              <chr>       <chr>                                   <model>
## 1 Northern Territory Food retai… arima1                <ARIMA(2,0,0)(1,1,0)[12]>
## 2 Northern Territory Food retai… arima2                <ARIMA(2,0,1)(1,1,1)[12]>
## 3 Northern Territory Food retai… arima3                <ARIMA(2,0,1)(1,1,0)[12]>
## 4 Northern Territory Food retai… arima_step   <ARIMA(1,0,2)(2,1,1)[12] w/ drift>
## 5 Northern Territory Food retai… arima_search <ARIMA(4,0,0)(0,1,2)[12] w/ drift>
## # A tibble: 5 × 6
##   .model         sigma2 log_lik    AIC   AICc    BIC
##   <chr>           <dbl>   <dbl>  <dbl>  <dbl>  <dbl>
## 1 arima_search 0.000472    798. -1580. -1580. -1550.
## 2 arima_step   0.000493    791. -1566. -1565. -1535.
## 3 arima2       0.000507    784. -1557. -1556. -1534.
## 4 arima3       0.000672    744. -1478. -1478. -1459.
## 5 arima1       0.000691    739. -1470. -1470. -1455.
## # A tibble: 5 × 12
## # Groups:   .model [5]
##   .model    State Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>     <chr> <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arima1    Nort… Food re… Test  -1.71  3.48  2.84 -1.50  2.33   NaN   NaN 0.723
## 2 arima2    Nort… Food re… Test  -5.44  6.61  5.49 -4.38  4.41   NaN   NaN 0.762
## 3 arima3    Nort… Food re… Test  -1.75  3.51  2.84 -1.52  2.33   NaN   NaN 0.724
## 4 arima_se… Nort… Food re… Test  -9.58 11.0   9.58 -7.62  7.62   NaN   NaN 0.811
## 5 arima_st… Nort… Food re… Test  -9.91 11.3   9.91 -7.88  7.88   NaN   NaN 0.816

Overall, the wider search of ARIMA() gives the best AIC, AICc and BIC values, with our guess of ARIMA(2,0,1)(1,1,1)[12] coming fairly close and the stepwise model second best.

However, the accuracy table against the test set tells a different story: in terms of RMSE (and most other statistics), our first guess ARIMA(2,0,0)(1,1,0)[12] performs best, closely followed by ARIMA(2,0,1)(1,1,0)[12], while the stepwise and search models perform noticeably worse out of sample.

We will use both search (ARIMA(4,0,0)(0,1,2)[12]) and stepwise model (ARIMA(1,0,2)(2,1,1)[12]) to check their residuals, ACF and Ljung-Box test.
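A sketch of those residual checks; the `dof` values are assumptions matching the number of estimated AR/MA coefficients in each model:

```r
# Residual plots for both automatic models
fit_arima |> select(arima_step) |> gg_tsresiduals()
fit_arima |> select(arima_search) |> gg_tsresiduals()

# Ljung-Box over 24 lags; dof = 6 here since (1,0,2)(2,1,1) and
# (4,0,0)(0,1,2) each estimate six AR/MA terms
fit_arima |> select(arima_step) |>
  augment() |> features(.innov, ljung_box, lag = 24, dof = 6)
fit_arima |> select(arima_search) |>
  augment() |> features(.innov, ljung_box, lag = 24, dof = 6)
```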

## # A tibble: 1 × 3
##   .model     lb_stat lb_pvalue
##   <chr>        <dbl>     <dbl>
## 1 arima_step    40.1   0.00205
## # A tibble: 1 × 3
##   .model       lb_stat lb_pvalue
##   <chr>          <dbl>     <dbl>
## 1 arima_search    15.2     0.645

From both models' residual plots and Ljung-Box tests, we can confidently choose the search model (ARIMA(4,0,0)(0,1,2)[12]) as our ARIMA selection, since the stepwise model fails the Ljung-Box test (p ≈ 0.002). Although the stepwise residuals look similar to the search model's in the histogram and ACF plot (with a few more marginal lags), the search model's Ljung-Box p-value of 0.645 > 0.05 means its residuals are not distinguishable from white noise.

## # A tsibble: 24 x 3 [1M]
##       Month .mean                  `80%`
##       <mth> <dbl>                 <hilo>
##  1 2017 Jan  116. [112.0886, 120.8300]80
##  2 2017 Feb  111. [106.8153, 116.0501]80
##  3 2017 Mar  124. [118.4129, 130.0141]80
##  4 2017 Apr  123. [116.6719, 130.0132]80
##  5 2017 May  130. [122.6145, 137.5364]80
##  6 2017 Jun  132. [124.4350, 140.6490]80
##  7 2017 Jul  146. [136.2969, 155.2320]80
##  8 2017 Aug  143. [133.9200, 153.2227]80
##  9 2017 Sep  136. [126.6378, 145.5332]80
## 10 2017 Oct  138. [127.8873, 147.5856]80
## # ℹ 14 more rows

ETS vs ARIMA

Overall, both the selected ETS and ARIMA models perform reasonably well against our 24-month test set. The ARIMA prediction intervals lean towards the upside, following the trend more closely, whereas the ETS intervals cover both a continuing trend and a flattening one. The ETS point forecasts also fit the test set slightly better than the ARIMA ones. On the other hand, if we take the residual analysis into account, the selected ARIMA model is likely the better of the two.

However, a single test set may not be enough to judge the viability of these two models. Further comparison using more data is needed to see which model performs better.

Applying model to full dataset

## # A tsibble: 24 x 3 [1M]
##       Month `Forecasted Mean`                  `80%`
##       <mth>             <dbl>                 <hilo>
##  1 2019 Jan              116. [111.2767, 119.8647]80
##  2 2019 Feb              110. [105.5432, 114.6978]80
##  3 2019 Mar              124. [117.9789, 129.7725]80
##  4 2019 Apr              123. [116.4343, 130.2603]80
##  5 2019 May              130. [122.6576, 138.4906]80
##  6 2019 Jun              132. [123.7706, 141.2325]80
##  7 2019 Jul              145. [134.4403, 155.0964]80
##  8 2019 Aug              142. [131.1726, 152.5423]80
##  9 2019 Sep              133. [122.6076, 143.7242]80
## 10 2019 Oct              134. [122.5678, 144.8293]80
## # ℹ 14 more rows

## # A tsibble: 24 x 3 [1M]
##       Month `Forecasted Mean`                  `80%`
##       <mth>             <dbl>                 <hilo>
##  1 2019 Jan              115. [110.8563, 119.4092]80
##  2 2019 Feb              110. [105.2798, 114.6510]80
##  3 2019 Mar              122. [116.2903, 127.9421]80
##  4 2019 Apr              121. [114.1817, 126.8471]80
##  5 2019 May              126. [118.7978, 133.2116]80
##  6 2019 Jun              127. [119.2569, 134.9397]80
##  7 2019 Jul              137. [128.1093, 146.2379]80
##  8 2019 Aug              136. [126.2123, 145.3195]80
##  9 2019 Sep              128. [118.4393, 137.5292]80
## 10 2019 Oct              129. [119.1685, 139.5348]80
## # ℹ 14 more rows

Comparing to ABS data

The ABS data now ends at March 2023, while our dataset ends at December 2018, so we produce forecasts 4 years and 3 months (51 months) ahead with each model.
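Refitting both chosen models to the full dataset and forecasting 51 months ahead might look like this (a sketch; the comparison against the new ABS release would join these forecasts to the re-downloaded series, which is not shown here):

```r
# Refit the selected specifications on the full series; the constant
# (~ 1) reproduces the "w/ drift" term reported for the search model
fit_full <- myseries |>
  model(
    ets   = ETS(Turnover ~ error("M") + trend("Ad") + season("M")),
    arima = ARIMA(box_cox(Turnover, lambda) ~ 1 + pdq(4, 0, 0) + PDQ(0, 1, 2))
  )

# 51 months takes us from Dec 2018 to Mar 2023
fc_full <- fit_full |> forecast(h = 51)

fc_full |> autoplot(myseries, level = 80)
```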

## # A tsibble: 12 x 4 [1M]
##       Month `Forecasted Mean` Turnover Difference
##       <mth>             <dbl>    <dbl>      <dbl>
##  1 2022 Apr              121.     140.      -18.7
##  2 2022 May              127.     141.      -14.6
##  3 2022 Jun              128.     144.      -16.4
##  4 2022 Jul              138.     158.      -19.7
##  5 2022 Aug              137.     151.      -14.2
##  6 2022 Sep              129.     143.      -14.0
##  7 2022 Oct              130.     144.      -14.2
##  8 2022 Nov              123.     139       -16.0
##  9 2022 Dec              130.     151.      -21.4
## 10 2023 Jan              116.     131       -15.0
## 11 2023 Feb              111.     125       -14.2
## 12 2023 Mar              123.     141.      -18.1

## # A tsibble: 12 x 4 [1M]
##       Month `Forecasted Mean` Turnover Difference
##       <mth>             <dbl>    <dbl>      <dbl>
##  1 2022 Apr              137.     140.      -2.94
##  2 2022 May              145.     141.       3.37
##  3 2022 Jun              147.     144.       2.65
##  4 2022 Jul              161.     158.       3.45
##  5 2022 Aug              158.     151.       7.17
##  6 2022 Sep              148.     143.       5.66
##  7 2022 Oct              149.     144.       4.98
##  8 2022 Nov              141.     139        1.98
##  9 2022 Dec              148.     151.      -3.10
## 10 2023 Jan              134.     131        2.94
## 11 2023 Feb              127.     125        2.36
## 12 2023 Mar              143.     141.       1.80

Each table corresponds to the model shown in the forecast plot above it.

Overall, our ARIMA model forecasts the ABS data quite accurately, with only small differences from the actual values. The ETS model, in contrast, consistently forecasts below the ABS values.

This is likely due to the damped trend component of the ETS model, as well as the residual autocorrelation noted in our earlier analysis, which keeps the ETS forecasts from following the trend. One redeeming point for the ETS model is that the ABS data still stays within its prediction intervals.

Conclusion

Our ARIMA model performs decently against this release of ABS data: it remains relevant more than four years beyond our dataset's cutoff, so it could be viable for future predictions unless anomalies such as economic shocks push the data in a different direction.

The ETS model, on the other hand, performs worse here even though its forecast mean was initially closer to the test set than the ARIMA model's. Given its residual diagnostics and test results, this deterioration was within expectation.

Furthermore, most of our forecast tests and accuracy measures used only a single 24-month test set. This is computationally cheap, but time-series cross-validation during model selection would give a more reliable comparison, at the cost of considerably more computation for long forecast horizons.
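A time-series cross-validation along these lines would give a more robust comparison (a sketch; the fold sizes are illustrative):

```r
library(fpp3)

# Successively longer training sets, each extended by 12 months
cv_folds <- myseries |>
  stretch_tsibble(.init = 240, .step = 12)

cv_folds |>
  model(
    ets   = ETS(Turnover ~ error("M") + trend("Ad") + season("M")),
    arima = ARIMA(box_cox(Turnover, lambda) ~ pdq(4, 0, 0) + PDQ(0, 1, 2))
  ) |>
  forecast(h = "2 years") |>
  accuracy(myseries)     # errors averaged over all forecast origins
```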

Our ETS model was also fitted on training data without the transformation, so refitting it on the transformed series might yield a better model, though this is not certain and would require testing beyond this report. Cross-validation might likewise produce a better ETS model; that too needs further investigation.